Different Similarity Measures for Text Classification Using Knn
Similar resources
Kernels and Similarity Measures for Text Classification
Measuring similarity between two strings is a fundamental step in text classification and other problems of information retrieval. Recently, kernel-based methods have been proposed for this task; since kernels are inner products in a feature space, they naturally induce similarity measures. Information theoretic (dis)similarities have also been the subject of recent research. This paper describ...
Text classification using similarity measures on intuitionistic fuzzy sets
An intuitionistic fuzzy set (IFS) is an extended version of a fuzzy set and is capable of representing hesitancy degrees. A framework for text classification is presented. Two main challenges are addressed: how to represent documents in terms of IFSs and how to obtain a pattern of each class from such an IFS-based representation. By using some existing similarity measures for IFSs, the proposed...
A Hybrid Text Classification Approach Using KNN And SVM
Text classification is the process of assigning text documents to predefined categories. A classifier determines the appropriate class for each text document according to the classification algorithm used. Due to emerging trends in the field of internet and computers, billions of text data are processed at a given time and so there is a need for organizing these data to prov...
An Efficient Text Classification Using Knn and Naive Bayesian
The main objective is to propose a text classification method based on feature selection and preprocessing, thereby reducing the dimensionality of the feature vector and increasing the classification accuracy. Text classification is the process of assigning a document to one or more target categories, based on its contents. In the proposed method, machine learning methods for text classification is ...
Text Reuse Detection using a Composition of Text Similarity Measures
Detecting text reuse is a fundamental requirement for a variety of tasks and applications, ranging from journalistic text reuse to plagiarism detection. Text reuse is traditionally detected by computing similarity between a source text and a possibly reused text. However, existing text similarity measures exhibit a major limitation: They compute similarity only on features which can be derived ...
Journal
Journal title: IOSR Journal of Computer Engineering
Year: 2012
ISSN: 2278-8727, 2278-0661
DOI: 10.9790/0661-0563036